Add DataFusionTableWidget::column_visibility and use it to improve visibility defaults for the partition table #9936

Merged
merged 5 commits into from
May 19, 2025

abey79
Member

@abey79 abey79 commented May 9, 2025

Related

What

Title.

Also fixes the column ordering, which was wrongly based on the column ids, and renames a bunch of things for consistency.

Current default view:

image

@abey79 abey79 added ui concerns graphical user interface exclude from changelog PRs with this won't show up in CHANGELOG.md dataplatform Rerun Data Platform integration feat-redap-browser Everything related to the in-viewer Redap browser labels May 9, 2025

github-actions bot commented May 9, 2025

Web viewer built successfully. If applicable, you should also test it:

  • I have tested the web viewer
Result Commit Link Manifest
4ac93d6 https://rerun.io/viewer/pr/9936 +nightly +main

Note: This comment is updated whenever you push a commit.

@abey79 abey79 marked this pull request as ready for review May 9, 2025 09:55
@abey79 abey79 force-pushed the antoine/partition-table-col-viz branch from ef10d2c to 3d49ec9 Compare May 9, 2025 14:46
@Wumpf Wumpf self-requested a review May 9, 2025 16:08
abey79 added 3 commits May 9, 2025 19:29
…visibility defaults for the partition table

Also fixes the column ordering, which was wrongly based on their id.
@abey79 abey79 force-pushed the antoine/partition-table-col-viz branch from 3d49ec9 to 5f500eb Compare May 9, 2025 17:31
Member

@Wumpf Wumpf left a comment

lgtm overall!
some code improvement suggestions and questions on exact semantics

Comment on lines 56 to 58
pub type ColumnNameFn<'a> = Option<Box<dyn Fn(&ColumnDescriptorRef<'_>) -> String + 'a>>;

pub type ColumnVisibilityFn<'a> = Option<Box<dyn Fn(&ColumnDescriptorRef<'_>) -> bool + 'a>>;
Member

type names are a bit misleading imho. Shouldn't this be

Suggested change
- pub type ColumnNameFn<'a> = Option<Box<dyn Fn(&ColumnDescriptorRef<'_>) -> String + 'a>>;
- pub type ColumnVisibilityFn<'a> = Option<Box<dyn Fn(&ColumnDescriptorRef<'_>) -> bool + 'a>>;
+ pub type ColumnNameFn<'a> = Box<dyn Fn(&ColumnDescriptorRef<'_>) -> String + 'a>;
+ pub type ColumnVisibilityFn<'a> = Box<dyn Fn(&ColumnDescriptorRef<'_>) -> bool + 'a>;

Member Author

These types are basically there to reduce the noise in the struct definition (which clippy complained about). They are not public (I removed that erroneous pub qualifier btw). The actual public interface is the builder methods, which spell out the entire signature for clarity, since they are simple enough. So since this whole option-box thing is used twice (here and in the delegate), I'd rather leave it as is.
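The arrangement described in this reply can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ColumnDesc`, `TableWidget`, and `is_visible` are hypothetical stand-ins, and only the shape of the `ColumnVisibilityFn` alias is taken from the snippet under review.

```rust
/// Hypothetical stand-in for the real `ColumnDescriptorRef`.
pub struct ColumnDesc {
    pub name: String,
}

// Private alias: it only keeps the struct definition below readable.
type ColumnVisibilityFn<'a> = Option<Box<dyn Fn(&ColumnDesc) -> bool + 'a>>;

/// Hypothetical widget; the real one is `DataFusionTableWidget`.
pub struct TableWidget<'a> {
    column_visibility: ColumnVisibilityFn<'a>,
}

impl<'a> TableWidget<'a> {
    pub fn new() -> Self {
        Self {
            column_visibility: None,
        }
    }

    /// Builder method: the public signature spells out the closure type in
    /// full instead of exposing the private alias.
    pub fn column_visibility(mut self, f: impl Fn(&ColumnDesc) -> bool + 'a) -> Self {
        self.column_visibility = Some(Box::new(f));
        self
    }

    /// Assumption for this sketch: a column is visible when no callback is set.
    pub fn is_visible(&self, desc: &ColumnDesc) -> bool {
        self.column_visibility.as_ref().map_or(true, |f| f(desc))
    }
}
```

With this shape, callers only ever see the builder method, so the `Option<Box<...>>` wrapping stays an internal detail of the struct.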

@abey79 abey79 merged commit a3ddd26 into main May 19, 2025
37 checks passed
@abey79 abey79 deleted the antoine/partition-table-col-viz branch May 19, 2025 12:25
Comment on lines +147 to +151
desc.short_name(),
RECORDING_LINK_FIELD_NAME
| DATASET_MANIFEST_ID_FIELD_NAME
| DATASET_MANIFEST_REGISTRATION_TIME_FIELD_NAME
)
Member

Matching on string here is extremely brittle and is already causing problems with #9983

Member Author

Well, that's a strange comment to make tbh. This code uses short_name, which #9983 removed, hence the compilation error. What is brittle is our lack of a standard around column names and how they relate to descriptors (see #9840). I've pushed a commit on #9983 which replaces short_name with display_name and tested that it still works as intended.

Member

@emilk emilk May 19, 2025

But does it make sense for this code to break when we change how columns are displayed (e.g. when we change the implementation of fn display_name)? Wouldn't it be better to explicitly match against e.g. ColumnDescriptor::archetype_field_name ?
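The concern raised here can be illustrated with a small sketch. Everything in it is hypothetical (the `Column` type, its `field_name`, the `LINK_FIELD` value, and the formatting done by `display_name`); it only shows why a check tied to a presentation-level name can silently stop matching when the formatting changes, while a check on a stable field does not.

```rust
/// Hypothetical column descriptor; not the real `ColumnDescriptorRef`.
struct Column {
    field_name: &'static str,
}

impl Column {
    /// Presentation-only name. Its formatting is assumed to be free to change.
    fn display_name(&self) -> String {
        self.field_name.replace('_', " ")
    }
}

// Placeholder value chosen for this sketch.
const LINK_FIELD: &str = "recording_link";

/// Matches on the stable field, independent of how the column is displayed.
fn visible_by_field_name(col: &Column) -> bool {
    matches!(col.field_name, LINK_FIELD)
}

/// Matches on the displayed string, so it depends on `display_name`'s formatting.
fn visible_by_display_name(col: &Column) -> bool {
    col.display_name() == LINK_FIELD
}
```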

Member Author

These columns don't have much metadata: they are basically just a raw record batch built ad hoc on the server. Specifically, they don't have an archetype. Again, we need a much stronger system to relate record batch/dataframe column names to column descriptors, one that accounts for arbitrary tables sent by the server and/or users.

image

Member

What is Name in the above screenshot? Is that ComponentName? If so, can we match against that instead of display_name?

Member Author

For component columns, it corresponds to component_name (this view mirrors the fields of ComponentColumnDescriptor). That field is currently only exposed indirectly through ColumnDescriptorRef's methods. Doing it properly amounts to resolving the desc <-> column name problem.

@@ -115,6 +117,8 @@ impl Server {
ui: &mut egui::Ui,
dataset: &Dataset,
) {
const RECORDING_LINK_FIELD_NAME: &str = "recording link";
Member

What is this hard-coded string? Where does it come from?

Member Author

This is a column that is live-injected using DataFusion.

Member

Where is it live-injected? Who gives it this name? What other piece of code needs to be updated in tandem with this piece of code?

Member Author

@abey79 abey79 May 19, 2025

As the location of this const suggests, this is all local to this method. generate_partition_links is what triggers that column generation (it takes a column name as input), and default_column_visibility defines, well, which columns are visible by default (and it wants that generated column to be visible by default).
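A rough sketch of the relationship described in this reply, under stated assumptions: `with_generated_column` is a hypothetical stand-in for whatever `generate_partition_links` does, `EXAMPLE_OTHER_FIELD_NAME` is a placeholder, and the free function `default_column_visibility` here only mimics the `matches!` pattern from the diff. Only the `RECORDING_LINK_FIELD_NAME` value comes from the snippet above.

```rust
// Value taken from the diff under review.
const RECORDING_LINK_FIELD_NAME: &str = "recording link";
// Placeholder, not a value from the source.
const EXAMPLE_OTHER_FIELD_NAME: &str = "example_other_field";

/// Hypothetical stand-in for the step that injects the generated column:
/// it takes the new column's name as input.
fn with_generated_column(mut columns: Vec<String>, new_column: &str) -> Vec<String> {
    columns.push(new_column.to_owned());
    columns
}

/// Columns shown by default, selected by matching on their name.
fn default_column_visibility(column_name: &str) -> bool {
    matches!(
        column_name,
        RECORDING_LINK_FIELD_NAME | EXAMPLE_OTHER_FIELD_NAME
    )
}
```

Because the same constant names the injected column and appears in the visibility match, the two uses cannot drift apart within this scope.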
